A neural network approach for the design of the target cost function in unit-selection speech synthesis
Abstract
The performance of corpus-based speech synthesis depends on the ability to model and appropriately represent all the characteristics of the speech units that serve as a basis for concatenation. Although there is general agreement on the set of essential features (fundamental frequency, duration, power and phonetic context), how best to model them and to weigh their respective contributions to the cost functions remains an open question, especially for the features related to the phonetic context. This paper presents a new approach for modeling the phonetic context that also simplifies the hard task of training the weights assigned to the different features in the target cost function.
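For orientation, the conventional formulation that the abstract refers to treats the target cost as a weighted sum of per-feature sub-costs. The sketch below illustrates that baseline only; it is not the neural network approach proposed in the paper. The `Unit` fields, the sub-cost definitions, and the weight values are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """Hypothetical description of a speech unit (target or candidate)."""
    f0: float          # fundamental frequency in Hz
    duration: float    # duration in seconds
    power: float       # power in dB
    left_phone: str    # phonetic context on the left
    right_phone: str   # phonetic context on the right


# Illustrative weights only; in practice these are what must be trained.
WEIGHTS = {"f0": 1.0, "duration": 1.0, "power": 0.5, "context": 2.0}


def target_cost(target: Unit, candidate: Unit, weights=WEIGHTS) -> float:
    """Weighted sum of sub-costs between a target and a candidate unit."""
    sub_costs = {
        # Relative differences for f0 and duration, absolute for power.
        "f0": abs(target.f0 - candidate.f0) / max(target.f0, 1e-9),
        "duration": abs(target.duration - candidate.duration)
        / max(target.duration, 1e-9),
        "power": abs(target.power - candidate.power),
        # Count of mismatching neighbouring phones (0, 1 or 2).
        "context": (target.left_phone != candidate.left_phone)
        + (target.right_phone != candidate.right_phone),
    }
    return sum(weights[name] * cost for name, cost in sub_costs.items())
```

Because every feature needs its own weight, and a categorical feature such as phonetic context needs a hand-designed mismatch measure, tuning this kind of function is laborious, which is the difficulty the abstract describes.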
Similar resources
Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One interesting topic in the multimedia domain is enabling computers to produce speech. Speech synthesis gives computers the human ability to produce speech. The two main approaches to speech synthesis are the data-based approach and the process-based approach, and each has its own challenges. Unit-selection speech synthesis and statistical parametr...
Utilizing a new feed-back fuzzy neural network for solving a system of fuzzy equations
This paper offers a new iterative method based on artificial neural networks for finding a solution of a system of fuzzy equations. Our proposed fuzzified neural network is a five-layer feedback neural network whose connection weights to the output layer are fuzzy numbers. This artificial neural network architecture can take a real input vector and calculate its corresponding fuzzy o...
The target cost formulation in unit selection speech synthesis
We review the various approaches that have been used to define the target cost in unit selection speech synthesis and show that there are a number of different and sometimes incompatible ways of defining this. We propose that this cost should be thought of as a measure of how similar two units sound to a human listener. We discuss the issue of what features should be used in unit selection and ...
Kinematic Synthesis of Parallel Manipulator via Neural Network Approach
In this research, Artificial Neural Networks (ANNs) have been used as a powerful tool to solve the inverse kinematic equations of a parallel robot. For this purpose, we have developed the kinematic equations of a Tricept parallel kinematic mechanism with two rotational and one translational degrees of freedom (DoF). Using the analytical method, the inverse kinematic equations are solved for spe...
Google's Next-Generation Real-Time Unit-Selection Synthesizer Using Sequence-to-Sequence LSTM-Based Autoencoders
A neural network model that significantly improves unit-selection-based Text-To-Speech synthesis is presented. The model employs a sequence-to-sequence LSTM-based autoencoder that compresses the acoustic and linguistic features of each unit to a fixed-size vector referred to as an embedding. Unit-selection is facilitated by formulating the target cost as an L2 distance in the embedding space. In o...